Discriminatively Trained Acoustic Model for Improving Mispronunciation Detection and Diagnosis in Computer Aided Pronunciation Training (CAPT)
Abstract
In this study, we propose a discriminative training algorithm that jointly minimizes mispronunciation detection errors (i.e., false rejections and false acceptances) and diagnosis errors (i.e., correctly pinpointing mispronunciations but incorrectly stating how they are wrong). An optimization procedure, similar to Minimum Word Error (MWE) discriminative training, is developed to refine the ML-trained HMMs. The errors to be minimized are obtained by comparing transcribed training utterances (including mispronunciations) with Extended Recognition Networks (ERNs) [3], which contain both canonical pronunciations and explicitly modeled mispronunciations. The ERN is compiled from either handcrafted or data-driven rules. Several conclusions can be drawn from the experiments: (1) data-driven rules are more effective than handcrafted ones in capturing mispronunciations; (2) compared with the ML training baseline, discriminative training reduces false rejections and diagnostic errors, though false acceptances increase slightly because the training set contains few false-acceptance samples.
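The error types named in the abstract can be illustrated with a minimal sketch. It assumes the canonical, transcribed, and recognized phone sequences are already aligned one-to-one (no insertions or deletions), which is a simplification and not necessarily how the paper scores its results. The function name `count_errors`, the category labels, and the sample phones are hypothetical.

```python
from collections import Counter


def count_errors(canonical, transcribed, recognized):
    """Tally detection and diagnosis outcomes per phone.

    Assumption (not from the paper): the three sequences are pre-aligned
    and have equal length.
    """
    if not (len(canonical) == len(transcribed) == len(recognized)):
        raise ValueError("sequences must be aligned to the same length")

    counts = Counter()
    for canon, actual, hyp in zip(canonical, transcribed, recognized):
        if actual == canon:
            # The learner pronounced the phone correctly.
            if hyp == canon:
                counts["true_acceptance"] += 1
            else:
                counts["false_rejection"] += 1
        else:
            # The learner mispronounced the phone.
            if hyp == canon:
                counts["false_acceptance"] += 1
            elif hyp == actual:
                counts["correct_diagnosis"] += 1
            else:
                # Mispronunciation detected, but the wrong phone was reported.
                counts["diagnostic_error"] += 1
    return counts


# Hypothetical aligned phone sequences, for illustration only.
canonical = ["n", "ao", "r", "th", "w"]
transcribed = ["n", "ao", "l", "f", "v"]
recognized = ["l", "ao", "l", "s", "w"]

result = count_errors(canonical, transcribed, recognized)
for label in sorted(result):
    print(label, result[label])
```

In this toy input each of the five outcomes occurs once. The false rejection, false acceptance, and diagnostic error counts correspond to the quantities the discriminative training is said to minimize jointly.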
Similar resources
Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT)
In this study, we propose a discriminative training algorithm to jointly minimize mispronunciation detection errors (i.e., false rejections and false acceptances) and diagnosis errors (i.e., correctly pinpointing mispronunciations but incorrectly stating how they are wrong). An optimization procedure, similar to Minimum Word Error (MWE) discriminative training, is developed to refine the ML-tra...
An Application of Modified Confusion Network for Improving Mispronunciation Detection in Computer-aided Mandarin Pronunciation Training
In this paper, we propose an application of confusion networks to Mandarin mispronunciation detection. Compared to previously published works, which are proven to work effectively and robustly in detecting mispronunciation at the word level and only successfully detect mispronunciation at the sentence level in a small, strictly constrained search space, our modified confusion network based Computer-aided Pron...
Automatic generation and pruning of phonetic mispronunciations to support computer-aided pronunciation training
This paper presents a mispronunciation detection system which uses automatic speech recognition to support computer-aided pronunciation training (CAPT). Our methodology extends a model pronunciation lexicon with possible phonetic mispronunciations that may appear in learners’ speech. Generation of these pronunciation variants was previously achieved by means of phone-to-phone mapping rules deriv...
On Mispronunciation Lexicon Generation Using Joint-Sequence Multigrams in Computer-Aided Pronunciation Training (CAPT)
We investigate the use of joint-sequence multigrams to generate L2 mispronunciation lexicons for mispronunciation detection and diagnosis. In the joint-sequence framework, a pair of parallel strings (namely, the input string of either graphemes or phonemes of the canonical pronunciation and the phonetic string of the mispronunciation) are aligned to form joint units for probabilistic estimation...
Evaluation Metric-related Optimization Methods for Mandarin Mispronunciation Detection
Mispronunciation detection and diagnosis are part and parcel of a computer assisted pronunciation training (CAPT) system, collectively facilitating second-language (L2) learners to pinpoint erroneous pronunciations in a given utterance so as to improve their spoken proficiency. This thesis presents a continuation of such a general line of research and the major contributions are three-fold. Fir...